Consistency of Cross Validation for Comparing Regression Procedures

Authors

  • Yuhong Yang
  • Y. YANG
Abstract

Theoretical developments on cross validation (CV) have mainly focused on selecting one among a list of finite-dimensional models (e.g., subset or order selection in linear regression) or selecting a smoothing parameter (e.g., bandwidth for kernel smoothing). However, little is known about consistency of cross validation when applied to compare between parametric and nonparametric methods or within nonparametric methods. We show that under some conditions, with an appropriate choice of data splitting ratio, cross validation is consistent in the sense of selecting the better procedure with probability approaching 1. Our results reveal interesting behavior of cross validation. When comparing two models (procedures) converging at the same nonparametric rate, in contrast to the parametric case, it turns out that the proportion of data used for evaluation in CV does not need to be dominating in size. Furthermore, it can even be of a smaller order than the proportion for estimation while not affecting the consistency property.
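The comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure: the two candidates (a least-squares line and a Nadaraya-Watson kernel smoother), the fixed bandwidth, the majority vote over random splits, and all function and parameter names (`fit_linear`, `fit_kernel`, `cv_select`, `n_eval`) are assumptions chosen for illustration. The parameter `n_eval` sets how many observations are held out for evaluation, which is the data splitting ratio the abstract refers to.

```python
import numpy as np

def fit_linear(x_train, y_train):
    """Parametric candidate (illustrative): least-squares line y = a + b*x."""
    design = np.column_stack([np.ones_like(x_train), x_train])
    coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)
    return lambda x: coef[0] + coef[1] * x

def fit_kernel(x_train, y_train, bandwidth=0.1):
    """Nonparametric candidate (illustrative): Nadaraya-Watson, Gaussian kernel."""
    def predict(x):
        diffs = (x[:, None] - x_train[None, :]) / bandwidth
        weights = np.exp(-0.5 * diffs**2)
        # Guard against a zero denominator far from all training points.
        return weights @ y_train / np.maximum(weights.sum(axis=1), 1e-12)
    return predict

def cv_select(x, y, procedures, n_eval, n_splits=20, seed=0):
    """Pick a procedure by majority vote over random estimation/evaluation splits.

    Each split holds out n_eval points for evaluation and fits every
    procedure on the remaining len(x) - n_eval points.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    votes = np.zeros(len(procedures), dtype=int)
    for _ in range(n_splits):
        perm = rng.permutation(n)
        eval_idx, est_idx = perm[:n_eval], perm[n_eval:]
        errors = []
        for fit in procedures:
            predictor = fit(x[est_idx], y[est_idx])
            residuals = y[eval_idx] - predictor(x[eval_idx])
            errors.append(np.mean(residuals**2))
        votes[int(np.argmin(errors))] += 1
    return int(np.argmax(votes)), votes

# Synthetic data (assumed for the demo): a nonlinear regression function plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=400)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=400)

# Hold out a quarter of the data for evaluation; index 0 is the line, 1 the smoother.
best, votes = cv_select(x, y, [fit_linear, fit_kernel], n_eval=100)
print("selected procedure index:", best, "votes:", votes)
```

Varying `n_eval` relative to the sample size is a simple way to experiment with the splitting ratio; the sketch does not verify any of the paper's conditions.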

Similar resources

Cross Validation for Comparing Multiple Density Estimation Procedures

We demonstrate the consistency of cross validation for comparing multiple density estimators using simple inequalities on the likelihood ratio. In nonparametric problems, the splitting of data does not require the domination of test data over the training/estimation data, contrary to Shao (1993). The result is complementary to that of Yang (2005) and Yang (2006).

Comparing Learning Methods for Classification

We address the consistency property of cross validation (CV) for classification. Sufficient conditions are obtained on the data splitting ratio to ensure that the better classifier between two candidates will be favored by CV with probability approaching 1. Interestingly, it turns out that for comparing two general learning methods, the ratio of the training sample size and the evaluation size ...

Overview of the validation procedures for a vaccine production: from R&D level to the pre-qualification stage

Just like any other process, vaccine manufacturing procedures are defined as a series of interrelated functions and activities using a variety of specified actions and equipment designed to produce a defined product. To assure the reproducibility and consistency of such processes, they must be carried out using validated equipment and under the established procedures that meet all the acceptanc...

Consistency Properties of Model Selection Criteria in Multiple Linear Regression

This paper concerns the asymptotic properties of a class of criteria for model selection in linear regression models, which covers the best-known criteria, e.g., Mallows' Cp, CV (cross-validation), GCV (generalized cross-validation), Akaike's AIC and FPE, as well as Schwarz's BIC. These criteria are shown to be consistent in the sense of selecting the true or larger models, assuming i.i.d....

Local M-Estimation of Regression Function

In this article, we investigate a robust version of local linear regression smoothers for stationary and censored stochastic processes by using M-type local polynomial techniques and transformations. Under some regularity conditions, we establish the weak and strong consistency as well as the asymptotic normality of proposed estimators. We propose an easily implemented bandwidth selection crite...

Publication date: 2007